Efficient Nonparametric Subgraph Detection Using Tree Shaped Priors

Authors

  • Nannan Wu
  • Feng Chen
  • Jianxin Li
  • Baojian Zhou
  • Naren Ramakrishnan
Abstract

Non-parametric graph scan (NPGS) statistics are used to detect anomalous connected subgraphs on graphs, and have a wide variety of applications, such as disease outbreak detection, road traffic congestion detection, and event detection in social media. In contrast to traditional parametric scan statistics (e.g., the Kulldorff statistic), NPGS statistics are free of distributional assumptions and can be applied to heterogeneous graph data. In this paper, we make a number of contributions to the computational study of NPGS statistics. First, we present a novel reformulation of the problem as a sequence of Budget Prize-Collecting Steiner Tree (BPCST) sub-problems. Second, we show that this reformulated problem is NP-hard for a large class of nonparametric statistic functions. Third, we develop efficient exact and approximate algorithms for a special category of graphs in which the anomalous subgraphs can be reformulated in a fixed tree topology. Finally, through extensive experiments, we demonstrate the performance of our proposed algorithms in two real-world application domains (water pollution detection in water sensor networks and spatial event detection in social media networks) and compare them against state-of-the-art connected subgraph detection methods.

Similar resources

An Efficient Approach to Event Detection and Forecasting in Dynamic Multivariate Social Media Networks

Anomalous subgraph detection has been successfully applied to event detection in social media. However, the subgraph detection problem becomes challenging when the social media network incorporates abundant attributes, which leads to a multivariate network. The multivariate characteristic makes most existing methods incapable of tackling this problem effectively and efficiently, as it involves joi...

Maxiset Comparisons of Procedures, Application to Choosing Priors in a Bayesian Nonparametric Setting

In this paper, our aim is to provide tools for easily calculating the maxisets of several procedures. We then apply these results to compare several Bayesian estimators in a nonparametric setting. We find that many Bayesian rules can be described through a general behavior such as being shrinkage rules, limited, and/or elitist rules. This has consequences on their maxiset...

The Time-Marginalized Coalescent Prior for Hierarchical Clustering

We introduce a new prior for use in Nonparametric Bayesian Hierarchical Clustering. The prior is constructed by marginalizing out the time information of Kingman’s coalescent, providing a prior over tree structures which we call the Time-Marginalized Coalescent (TMC). This allows for models which factorize the tree structure and times, providing two benefits: more flexible priors may be constru...

A Bayesian Nonparametric Approach to Testing for Dependence Between Random Variables

Nonparametric and nonlinear measures of statistical dependence between pairs of random variables are important tools in modern data analysis. In particular, the emergence of large data sets can now support the relaxation of linearity assumptions implicit in traditional association scores such as correlation. Here we describe a Bayesian nonparametric procedure that leads to a tractable, explicit ...

Nonparametric Function Estimation Using Overcomplete Dictionaries

We consider the problem of estimating an unknown function based on noisy data using nonparametric regression. One approach to this estimation problem is to represent the function in a series expansion using a linear combination of basis functions. Overcomplete dictionaries provide a larger but redundant collection of generating elements than a basis; however, coefficients in the expansion are ...

Publication date: 2016